Tag
3 articles
Explore the significance of Hugging Face's TRL v1.0, a unified framework for aligning large language models through post-training techniques like SFT, Reward Modeling, DPO, and GRPO.
Learn to implement and use state space models with the Mamba architecture, focusing on Mamba-3's 2x smaller state size and improved hardware efficiency.
Inception has launched Mercury 2, the first diffusion-based reasoning language model, which processes entire passages in parallel and is more than five times faster than traditional autoregressive models.